Pronunciation variation speech recognition without dictionary modification on sparse database

Authors

  • Supphanat Kanokphara
  • Virongrong Tesprasit
  • Rachod Thongprasirt
Abstract

Generally, a speech recognition system uses a fixed set of pronunciations from the dictionary for training and decoding. However, even a well-defined lexicon cannot support all variations in human pronunciation. Moreover, a dictionary covering all possible pronunciations would be too large to implement. Sharing Gaussian densities across phonetic models and using decision trees for pronunciation variation have proved efficient for pronunciation variation systems without dictionary modification. This paper presents alternative methods that can be used even when the database is sparse. Re-label training is modified to include rule-based pronunciation variation in order to obtain real phonetic acoustic models. Phonemic acoustic models are then retrained by tying HMM states across the phonetic models. These new phonemic models allow alternative search paths during recognition. The system shows better performance in the experiment.

Similar resources

Pronunciation Variation Speech Recognition without New Dictionary Construction

Generally, a speech recognition system uses a fixed set of pronunciations from the dictionary for training and decoding. However, even a well-defined dictionary cannot support all variations in human pronunciation. Moreover, a dictionary covering all possible pronunciations would be too large to implement. This paper presents efficient strategies for ...

Automatic segmentation and clustering of speech using sparse coding

We investigate the application of sparse coding and dictionary learning to the discovery of sub-word units in speech. The ultimate goal is to generate pronunciation dictionaries that could be used for automatic speech recognition (ASR). A dictionary of sparse coding atoms is trained to code a subset of the TIMIT corpus. Some of the trained units exhibit strong correlation with specific referenc...

Modeling Pronunciation Variation for Cantonese Speech Recognition

Due to the large variability of pronunciation in spontaneous speech, pronunciation modeling becomes a more challenging and essential part of speech recognition. In this paper, we describe two different approaches to pronunciation modeling using decision trees. At the lexical level, a pronunciation variation dictionary is built to obtain alternative pronunciations for each word, in which each entr...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applicable areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

Improving pronunciation modeling for non-native speech recognition

In this paper, three different approaches to pronunciation modeling are investigated. Two existing pronunciation modeling approaches, namely the pronunciation dictionary and n-best rescoring approaches, are modified to work with a small amount of non-native speech. We also propose a speaker clustering approach, which is capable of grouping speakers based on their pronunciation habits. Given some s...

Publication date: 2003